Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image using CNN
Collaborative robots are becoming more common on factory floors as well as in
regular environments; however, their safety is still not a fully solved issue.
Collision detection does not always perform as expected and collision avoidance
is still an active research area. Collision avoidance works well for fixed
robot-camera setups; however, if they are shifted around, Eye-to-Hand
calibration becomes invalid, making it difficult to accurately run many of the
existing collision avoidance algorithms. We approach the problem by presenting
a stand-alone system capable of detecting the robot and estimating its
position, including individual joints, by using a simple 2D colour image as an
input, where no Eye-to-Hand calibration is needed. As an extension of previous
work, a two-stage transfer learning approach is used to re-train a
multi-objective convolutional neural network (CNN) to allow it to be used with
heterogeneous robot arms. Our method is capable of detecting the robot in
real-time, and new robot types can be added using significantly smaller
training datasets than a fully trained network requires. We
present the data collection approach, the structure of the multi-objective CNN, the
two-stage transfer learning training, and test results using real robots from
Universal Robots, Kuka, and Franka Emika. Finally, we analyse possible
application areas of our method together with possible improvements. Comment: 6+n pages, ICRA 2019 submission
Adaptive Context Encoding Module for Semantic Segmentation
Object sizes in images are diverse; therefore, capturing multi-scale
context information is essential for semantic segmentation. Existing context
aggregation methods such as the pyramid pooling module (PPM) and atrous spatial
pyramid pooling (ASPP) use different pooling sizes or atrous rates so that
multi-scale information is captured. However, the pooling sizes and atrous
rates are chosen manually and empirically. In order to capture object context
information adaptively, we propose an adaptive context encoding
(ACE) module based on the deformable convolution operation to augment multi-scale
information. Our ACE module can be embedded into other Convolutional
Neural Networks (CNN) easily for context aggregation. The effectiveness of the
proposed module is demonstrated on Pascal-Context and ADE20K datasets. Although
our proposed ACE only consists of three deformable convolution blocks, it
outperforms PPM and ASPP in terms of mean Intersection over Union (mIoU) on both
datasets. The experimental study confirms that our proposed module is
effective compared to state-of-the-art methods.
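For reference, the mean Intersection over Union metric cited above can be sketched as follows. This is a minimal pure-Python illustration, not the evaluation code used in the paper; the function name `mean_iou` and the flat per-pixel label lists are assumptions made for the example. Classes that appear in neither the prediction nor the ground truth are skipped, which is one common convention among several.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in pred or target.

    pred, target: equal-length sequences of integer class labels,
    one label per pixel (hypothetical input layout for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class absent everywhere: leave it out of the mean
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

In practice the per-class intersections and unions are usually accumulated over the whole validation set before dividing, rather than averaged image by image.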
Spatial Orientation in Cardiac Ultrasound Images Using Mixed Reality: Design and Evaluation
Spatial orientation is an important skill in structural cardiac imaging. Until recently, 3D cardiac ultrasound has been visualized on a flat screen by using volume rendering. Mixed reality devices enhance depth perception, spatial awareness, interaction, and integration in the physical world, which can prove advantageous with 3D cardiac ultrasound images. In this work, we describe the design of a system for rendering 4D (3D + time) cardiac ultrasound data as virtual objects and evaluate it for ease of spatial orientation by comparing it with a standard clinical viewing platform in a user study. The user study required eight participants to do timed tasks and rate their experience. The results showed that virtual objects in mixed reality provided easier spatial orientation and morphological understanding despite lower perceived image quality. Participants familiar with mixed reality were quicker to orient in the tasks. This suggests that familiarity with the environment plays an important role, and with improved image quality and increased use, mixed reality applications may perform better than conventional 3D echocardiography viewing systems.
Transfer Learning for Unseen Robot Detection and Joint Estimation on a Multi-Objective Convolutional Neural Network
A significant problem of using deep learning techniques is the limited amount
of data available for training. Some datasets are available for
popular problems such as item recognition and classification or self-driving cars;
however, data are very limited for the industrial robotics field. In previous
work, we have trained a multi-objective Convolutional Neural Network (CNN) to
identify the robot body in the image and estimate 3D positions of the joints by
using just a 2D image, but it was limited to a range of robots produced by
Universal Robots (UR). In this work, we extend our method to work with a new
robot arm - Kuka LBR iiwa, which has a significantly different appearance and
an additional joint. However, instead of collecting large datasets once again,
we collect a number of smaller datasets containing a few hundred frames each
and use transfer learning techniques on the CNN trained on UR robots to adapt
it to a new robot with different shapes and visual features. We show
that transfer learning is not only applicable in this field but also requires
smaller, well-prepared training datasets, trains significantly faster, and
reaches accuracy similar to the original method, even improving on it in
some aspects. Comment: Regular paper submission to 2018 IEEE International Conference on
Intelligence and Safety Robotics (ISR). Camera Ready paper
Multi-Objective Convolutional Neural Networks for Robot Localisation and 3D Position Estimation in 2D Camera Images
The field of collaborative robotics and human-robot interaction often focuses
on the prediction of human behaviour, while assuming that information about the
robot setup and configuration is known. This is often the case with fixed
setups, which have all the sensors fixed and calibrated in relation to the rest
of the system. However, it becomes a limiting factor when the system needs to
be reconfigured or moved. We present a deep learning approach, which aims to
solve this issue. Our method learns to identify and precisely localise the
robot in 2D camera images, so having a fixed setup is no longer a requirement
and a camera can be moved. In addition, our approach identifies the robot type
and estimates the 3D position of the robot base in the camera image as well as
3D positions of each of the robot joints. Learning is done with a
multi-objective convolutional neural network that trains the four previously mentioned
objectives simultaneously using a combined loss function. The multi-objective
approach makes the system more flexible and efficient by reusing some of the
same features and diversifying for each objective in lower layers. A fully
trained system shows promising results in providing an accurate mask of where
the robot is located and an estimate of its base and joint positions in 3D. We
compare the results to our previous approach of using cascaded convolutional
neural networks. Comment: Ubiquitous Robots 2018 Regular paper submission
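The idea of training several objectives through one combined loss can be illustrated as a weighted sum of per-objective losses. This is only a sketch under stated assumptions: the abstract does not give the individual loss terms or their weights, so the objective names, the use of mean squared error for every objective, and the equal default weights below are all hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(outputs, targets, weights=None):
    """Weighted sum of per-objective losses.

    outputs, targets: dicts mapping an objective name (e.g. "mask",
    "robot_type", "base_xyz", "joints_xyz"; names are assumptions)
    to a flat list of floats. weights: optional dict of per-objective
    weights; every objective counts equally when omitted.
    """
    if weights is None:
        weights = {name: 1.0 for name in outputs}
    return sum(weights[name] * mse(outputs[name], targets[name])
               for name in outputs)
```

A real network would more likely pair each objective with a loss suited to it, such as cross-entropy for the robot type, and tune the weights so that no single objective dominates the gradient.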
Robot Localisation and 3D Position Estimation Using a Free-Moving Camera and Cascaded Convolutional Neural Networks
Many works in collaborative robotics and human-robot interaction focus on
identifying and predicting human behaviour while considering the information
about the robot itself as given. This can be the case when sensors and the
robot are calibrated in relation to each other and often the reconfiguration of
the system is not possible, or extra manual work is required. We present a deep
learning based approach that removes the requirement for the
robot and the vision sensor to be fixed and calibrated in relation to each
other. The system learns the visual cues of the robot body and is able to
localise it, as well as estimate the position of robot joints in 3D space by
just using a 2D color image. The method uses a cascaded convolutional neural
network, and we present the structure of the network, describe our own
collected dataset, explain the network training and achieved results. A fully
trained system shows promising results in providing an accurate mask of where
the robot is located and a good estimate of its joint positions in 3D. The
accuracy is not yet good enough for visual servoing applications; however, it
can be sufficient for general safety and some collaborative tasks not requiring
very high precision. The main benefit of our method is that the
vision sensor can move freely. This allows it to be mounted on moving objects,
for example, the body of a person or a mobile robot working in the same
environment as the robots. Comment: Submission for IEEE AIM 2018 conference, 7 pages, 7 figures, ROBIN
group, University of Oslo
Use of stereo-laparoscopic liver surface reconstruction to compensate for pneumoperitoneum deformation through biomechanical modeling.
Abdominal organs undergo large deformations due to intra-abdominal pressure (pneumoperitoneum) during laparoscopic surgery, especially large organs such as the liver [2]. These deformations cause large inaccuracies when using surgical navigation systems [2]. Fortunately, intra-operative CT/MRI imaging can be acquired in modern hybrid ORs, as can laparoscopic ultrasound, and both can be used to provide updated organ models. However, these medical imaging modalities are expensive and may extend the surgical workflow; hence, biomechanical models could be used as a solution for intra-operative registration, and also to account for organ deformations due to surgical manipulation. Within this study, we propose a solution to compensate for pneumoperitoneum, which could greatly increase the accuracy of liver surgical navigation systems.
Towards a Video Quality Assessment based Framework for Enhancement of Laparoscopic Videos
Laparoscopic videos can be affected by different distortions which may impact
the performance of surgery and introduce surgical errors. In this work, we
propose a framework for automatically detecting and identifying such
distortions and their severity using video quality assessment. There are three
major contributions presented in this work: (i) a proposal for a novel video
enhancement framework for laparoscopic surgery; (ii) a publicly available
database for quality assessment of laparoscopic videos evaluated by expert as
well as non-expert observers; and (iii) objective video quality assessment of
laparoscopic videos including their correlations with expert and non-expert
scores. Comment: SPIE Medical Imaging 2020 (Draft version)
Feasibility of a three-axis epicardial accelerometer in detecting myocardial ischemia in cardiac surgical patients
Objective: We investigated the feasibility of continuous detection of myocardial ischemia during cardiac surgery with a 3-axis accelerometer. Methods: Ten patients with significant left anterior descending coronary artery stenosis underwent off-pump coronary artery bypass grafting. A 3-axis accelerometer (11 × 14 × 5 mm) was sutured onto the left anterior descending coronary artery–perfused region of left ventricle. Twenty episodes of ischemia were studied, with 3-minute occlusion of left anterior descending coronary artery at start of surgery and 3-minute occlusion of left internal thoracic artery at end of surgery. Longitudinal, circumferential, and radial accelerations were continuously measured, with epicardial velocities calculated from the signals. During occlusion, accelerometer velocities were compared with anterior left ventricular longitudinal, circumferential, and radial strains obtained by echocardiography. Ischemia was defined by change in strain greater than 30%. Results: Ischemia was observed echocardiographically during 7 of 10 left anterior descending coronary artery occlusions but not during left internal thoracic artery occlusion. During ischemia, there were no significant electrocardiographic or hemodynamic changes, whereas large and significant changes in accelerometer circumferential peak systolic (P < .01) and isovolumic (P < .01) velocities were observed. During 13 occlusions, no ischemia was demonstrated by strain, nor was any change demonstrated by the accelerometer. A strong correlation was found between circumferential strain and accelerometer circumferential peak systolic velocity during occlusion (r = −0.76, P < .001). Conclusions: The epicardial accelerometer detects myocardial ischemia with great accuracy. This novel technique has potential to improve monitoring of myocardial ischemia during cardiac surgery.
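The two quantitative steps in this abstract, the 30% strain-change criterion and the correlation between strain and accelerometer velocity, can be sketched as below. This is an illustrative sketch, not the study's analysis code: the function names are hypothetical, and treating the criterion as a relative change in magnitude against a pre-occlusion baseline is an assumption, since the abstract does not say how the change was computed.

```python
import math


def relative_change(baseline, value):
    """Change of value relative to the magnitude of a non-zero baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / abs(baseline)


def is_ischemic(baseline_strain, occlusion_strain, threshold=0.30):
    """True when strain changes by more than the threshold (assumed relative)."""
    return abs(relative_change(baseline_strain, occlusion_strain)) > threshold


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

`pearson_r` yields only the coefficient; the P value reported alongside r in the abstract would need a separate significance test.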